Backing out of a processor architectural state

ABSTRACT

A processor having a plurality of registers is provided. The processor is capable of re-executing at least one selected instruction by backing out of an architectural register state. A method is provided for backing a processor out of an architectural state. The method comprises reassigning a register to a logical operand of an instruction, the register having been assigned to the logical operand in a previous architectural state; and re-executing the instruction.

BACKGROUND OF THE INVENTION

[0001] 1. Field of the Invention

[0002] The present invention relates generally to computers and processors, and more specifically, to delaying the deallocation of registers and backing out of architectural states.

[0003] 2. Description of the Related Art

[0004] Processors fetch and execute a sequence of instructions from memory. The instructions ordinarily manipulate data stored in memory or registers. Typically, the processor decodes the instructions into first and second types of instructions adapted to execution on particular types of hardware units. The first type of micro-instruction loads and stores data between the memory and registers, which are typically internal to the processor. The second type of micro-instruction manipulates data stored in the internal registers and writes the results from the manipulations back to the internal registers. Since the number of internal registers is limited, an absence of available internal registers may occur, causing a bottleneck at the decode stage. The processor ordinarily employs methods that efficiently use the internal registers to reduce the occurrence of decode bottlenecks.

[0005] One mechanism for using the limited number of internal registers entails producing instructions through several operations. First, the processor decodes an incoming instruction into one or more instructions having logical operands. Hereafter, logical operands are defined to mean dummy variables for some source and destination addresses of instructions. Second, an allocator assigns one or more of the available internal registers to the logical operands introduced in the first step. Third, a retirement unit deallocates the previously assigned internal registers of executed instructions without substantial delay when other instructions no longer need to read the contents of the registers. Deallocation makes more internal registers available for assignment to newly decoded instructions. Thus, retirement units should rapidly deallocate registers to reduce the occurrence of instruction decode bottlenecks.

[0006] Processors also have hardware for recovering from what are referred to as execution “exceptions”. Exceptions may be attributable to interrupts and faults generated during execution of instructions. Recovering from an exception involves both detecting the exception and reporting the exception to hardware that may re-execute any improperly executed instructions. The proper re-execution normally involves returning the processor to a pre-exception state. Thus, re-execution may include restoring original data to internal registers and reinserting the excepting instruction and the instructions dependent thereupon back into execution pipelines.

[0007] A system designed to detect and report all exceptions may employ substantial hardware, i.e., a large area on the processor chip, and may encumber the ordinary retirement cycle. The detection of complex fault events may entail heavy area and time costs, because more verifications are ordinarily employed to check for complex faults. Complex fault detection may slow the retirement process with verifications for rarely occurring faults.

[0008] For a macro-instruction, I₁, decoding into a sequence μI₁, μI₂, etc., an exception may occur on both the earlier and later members of the sequence, e.g., μI₁ or μI₂. Two methods may be pursued to recover from an exception on a later member, e.g., μI₂. First, the processor may correct the condition causing the exception and re-execute only the excepting instructions by (a) detecting which instruction excepted, and (b) reinstating the initial execution state associated therewith. Second, the processor may correct the condition causing the exception and re-execute the entire sequence, i.e., μI′₁, μI′₂, etc., whenever any member of the sequence registers an exception. Implementing either of the above methods may be problematic.

[0009] Since detecting exceptions on individual members of a sequence may be complex, re-executing the entire sequence from decoding the macro-instruction may save time and reduce hardware needs. But the sequence from the macro-instruction may include “retired” instructions, because earlier members, e.g., μI₁, may have completed execution. For example, the instruction R+R′→R destroys the original data in R when the instruction is retired, i.e., the architectural state has changed. Thus, re-executing earlier members of the sequence, e.g., μI₁, may be problematic. Prior art processors may not handle exceptions on instructions produced by decoding a single macro-instruction efficiently.

[0010] The present invention is directed to overcoming, or at least reducing the effects of, one or more of the problems set forth above.

SUMMARY OF THE INVENTION

[0011] A first aspect of the present invention provides an apparatus comprising a processor. The processor has a plurality of registers. The processor is capable of re-executing at least one selected instruction by backing out of an architectural register state. A second aspect of the present invention provides a method for backing a processor out of an architectural state. The method comprises reassigning a register to a logical operand of an instruction, the register having been assigned to the logical operand in a previous architectural state; and re-executing the instruction.

BRIEF DESCRIPTION OF THE DRAWINGS

[0012] Other objects and advantages of the invention will become apparent upon reading the following detailed description and upon reference to the drawings in which:

[0013] FIG. 1 is a high-level block diagram of an embodiment, in accordance with the present invention, of a processor that delays the deallocation of a portion of the registers;

[0014] FIG. 2 is a flowchart illustrating an embodiment of a method for executing instructions in the processor of FIG. 1;

[0015] FIG. 3 is a high-level block diagram of an embodiment of a processor having a back-out register for use in delaying the deallocation of selected registers;

[0016] FIG. 4 is a flowchart illustrating an embodiment of a method for re-executing selected instructions in the processor of FIG. 3;

[0017] FIG. 5 is a flowchart illustrating an embodiment of a method of re-executing selected instructions in the processor of FIG. 3 by backing out of an architectural state;

[0018] FIG. 6 is a high-level block diagram of an embodiment of a processor which implements speculative execution and also backs out of architectural states for re-executions involving selected registers;

[0019] FIG. 7 illustrates a time line of the register allocation table and back-out register as instructions progress through the processor of FIG. 6; and

[0020] FIG. 8 is a flowchart illustrating an embodiment of one method of operating the processor of FIG. 6.

[0021] While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof have been shown by way of example in the drawings and are herein described in detail. It should be understood, however, that the description herein of specific embodiments is not intended to limit the invention to the particular forms disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention as defined by the appended claims.

DETAILED DESCRIPTION OF SPECIFIC EMBODIMENTS

[0022] Specific embodiments of the invention are described below. In the interest of clarity, not all features of an actual implementation are described in this specification. It will of course be appreciated that in the development of any such actual embodiment, numerous implementation-specific decisions may be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another. Moreover, it will be appreciated that such a development effort, even if complex and time-consuming, would be a routine undertaking for those of ordinary skill in the art having the benefit of this disclosure.

[0023] Hereafter, an architectural state is the state of a processor's registers and memories after writes by all executed instructions determined to have executed properly, i.e., after writes by all properly retired instructions. A speculative state is the state of a processor's registers and memories after writes by all executed instructions, i.e., after writes by all executed instructions whether or not the instructions have been determined to have executed properly. A processor updates a speculative state to an architectural state after a retirement unit determines that the instructions, which will update the state, have properly executed.

[0024] FIG. 1 is a high-level block diagram illustrating a portion of a first embodiment of a processor 20 that delays the deallocation of selected registers. In some embodiments, the selected registers include all internal registers. An allocator 21 is a hardware device that assigns registers 22, 23, 24 to a portion of the logical operands of incoming instructions. In some embodiments, the registers 22, 23, 24 belong to a register file 25, i.e., a hardware structure for handling and directing accesses of the plurality of internal registers 22, 23, 24. A retirement unit 26 retires instructions that have been executed by an execution unit 27. The retirement unit 26 deallocates the registers 22, 23, 24 that were assigned to the executed instructions. Deallocation makes the registers 22, 23, 24 available for assignment to new incoming instructions by the allocator 21. The retirement unit 26 delays the deallocation of a selected portion of the registers 22, 23, 24.

[0025] Still referring to FIG. 1, the registers 22, 23, 24 may be classified into available registers, used registers, and delayed registers. The allocator 21 may assign an “available” register to a logical operand of an incoming instruction. The allocator 21 may not assign either “used” or “delayed” registers to the logical operands of incoming instructions. The “used” and the “delayed” registers are not deallocated in the sense that the allocator 21 may not reassign them to a destination logical operand of an incoming instruction. By definition, at least one instruction may read or write a “used” register. Active instructions may neither read nor write “delayed” registers. The processor 20 saves the identifiers of “delayed” registers. Register identifiers are physical addresses. Thus, execution results in “delayed” registers may be accessed and used even though the instructions that produced the results are retired, and the results have been removed from the processor's architectural state.
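By way of illustration only, the three-way classification of the registers 22, 23, 24 may be sketched in software as follows. The names RegState and RegisterPool, and the first-available selection policy, are hypothetical and are not part of the described embodiments; the sketch merely assumes that the allocator 21 may hand out only “available” registers.

```python
from enum import Enum


class RegState(Enum):
    AVAILABLE = "available"  # may be assigned by the allocator
    USED = "used"            # may be read or written by at least one instruction
    DELAYED = "delayed"      # not assignable; identifier is saved for back-out


class RegisterPool:
    """Hypothetical model of the allocator's view of registers 22, 23, 24."""

    def __init__(self, identifiers):
        self.state = {ident: RegState.AVAILABLE for ident in identifiers}

    def allocate(self):
        # Only "available" registers may be assigned to an incoming instruction.
        for ident, st in self.state.items():
            if st is RegState.AVAILABLE:
                self.state[ident] = RegState.USED
                return ident
        return None  # no available register: the decode stage would stall

    def delay(self, ident):
        # The register stays out of the available pool; its contents remain.
        self.state[ident] = RegState.DELAYED

    def deallocate(self, ident):
        self.state[ident] = RegState.AVAILABLE


pool = RegisterPool([22, 23, 24])
first = pool.allocate()
second = pool.allocate()
pool.delay(first)          # retired, but kept for a possible back-out
third = pool.allocate()
fourth = pool.allocate()   # nothing left: "used" and "delayed" are skipped
```

In this sketch the fourth request fails even though the first register is no longer read by active instructions, which reflects the cost of delaying deallocation.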

[0026] Hereafter, a class of registers stores a type of data, e.g., floating-point data, integer data, predicate values, multimedia data, etc. The term class may also apply to logical operands, e.g., floating-point data, integer data, predicate values, multimedia data, etc. In various embodiments, the classes may also store types of data which are not enumerated above.

[0027] FIG. 2 is a flowchart illustrating an embodiment of a method 30 for executing instructions in the processor 20 of FIG. 1. At block 31, the allocator 21 assigns a first register 23 to a first logical operand of a first instruction. At block 32, the allocator 21 assigns a second register 24 to a second logical operand of a second instruction. The second instruction follows the first instruction in the instruction sequence. In the illustrated embodiments, the first and second logical operands are the same logical operand. In other embodiments, the first and second logical operands may be different logical operands as long as they belong to the same preselected class, e.g., floating point. At block 33, the execution unit 27 executes the first and second instructions. At block 34, the retirement unit 26 retires the executed first instruction. At block 35, the retirement unit 26 saves the identifier of the first register in response to retiring the second instruction. The first register 23 is a “delayed” register, i.e., the contents therein may still be retrieved.

[0028] Still referring to FIGS. 1 and 2, the retirement unit 26 delays the deallocation and saves the identifiers of preselected classes of the registers 22, 23, 24 and logical operands or of the registers of selected classes of instructions. Different embodiments may select different classes of the registers 22, 23, 24 and logical operands or different classes of instructions. In specific embodiments, the retirement unit 26 may delay the deallocation of one or of more than one selected class of registers. In some embodiments, the retirement unit 26 may delay the deallocation of the registers 22, 23, 24 associated with specific instruction classes, e.g., one or more registers 22, 23, 24 assigned to instructions resulting from the decoding of a selected single macro-instruction.

[0029] Some embodiments in accordance with the invention may employ “backing out” of an architectural state in a processor that speculatively executes instructions. FIG. 3 is a high-level block diagram illustrating one embodiment of a processor 38 that includes a back-out register file 39 to delay the deallocation of selected registers. The back-out register file 39 has storage positions 40, 41 to save the identifiers of the registers 22, 23, 24 previously assigned to the selected destination logical operand of one or more retired instructions. The back-out register file 39 may comprise one or several registers. The retirement unit 26 writes the identifier of the register 22, 23, 24 assigned to a destination logical operand of a selected and retired first instruction to the back-out register file 39 in response to retiring a second instruction belonging to the same class. In some embodiments, the second instruction is an instruction having the same destination logical operand as the first instruction. In the prior art, the previously assigned registers might have been deallocated, because unretired instructions may no longer read the data stored in the register assigned to the first instruction after the retirement of the second instruction having the same destination logical register. The portion of the registers 22, 23, 24 with identifiers stored in the back-out register file 39 are “delayed” registers.

[0030] Still referring to FIG. 3, the processor 38 re-executes instructions in response to selected exceptions detected by the retirement unit 26. A decoder 44 translates incoming instructions into sequences of instructions, and sends the instructions to the allocator 21. The retirement unit 26 detects selected exceptions and sets instructions for re-execution in response to the selected exceptions. The retirement unit 26 may signal microcode 45 to prepare instructions for re-execution. Microcode is a combination of hardware and specialized permanent memory, e.g., read-only memory (ROM), that performs a special function and is ordinarily internal to the processor. In response to the signal from the retirement unit 26, the microcode 45 reads the back-out register file 39 to obtain the identifiers of the portion of the registers 22, 23, 24 previously assigned to the selected logical operands. The microcode 45 produces machine code for the instructions for re-execution. During the re-execution, selected logical operands are assigned the portion of the registers 22, 23, 24 which were previously assigned and correspond to the identifiers saved in the back-out register file 39, i.e., the “delayed” registers. The microcode 45 introduces the previous register assignments in the machine code of the instructions for re-execution. This may be referred to as “backing out architectural register assignments.” An output line 46 sends the instructions to re-execute from the microcode 45 to the execution unit 27.

[0031] FIG. 4 is a flowchart illustrating an embodiment of a method 47 for re-executing selected instructions in the processor 38 of FIG. 3. The method includes backing out of an architectural register state. At block 48, the processor 38 executes a first instruction having a first register as a destination address. At block 49, the retirement unit 26 retires a second instruction having a second register 22 as a destination address. The first and second registers 23, 22 have been assigned to the same selected logical operand by the allocator 21. At block 50, the retirement unit 26 makes the second register 22 a “delayed” register in response to determining that the first instruction is ready to retire. At block 51, the retirement unit 26 retires the first instruction, already having retired the second instruction. At retirement, the first register 23 becomes an architectural register that may still be read by unretired instructions. At block 52, the processor 38 re-executes a third instruction having the selected logical operand as a source or as a destination address. The third instruction may be one of the above-mentioned instructions or another instruction. Re-executing includes reassigning the second register 22, i.e., a delayed register, to the same selected logical operand in the third instruction. By reassigning the second register to the selected logical operand, re-execution backs out of the architectural assignment.

[0032] Referring still to FIG. 4, some embodiments may deallocate a register if another instruction having a destination register of the selected class retires. For example, at block 53, the retirement unit 26 deallocates the first register 23 in response to retiring a fourth instruction having the different register 24 assigned to the same logical operand. In other embodiments (not shown), the storage positions 40, 41 may store the identifiers of both the portion of the registers 22, 23, 24 previously assigned and before previously assigned to the selected logical operands. Such an embodiment may back out of several changes to the architectural register state. In other embodiments, the storage positions 40, 41 may store the identifiers of a portion of the registers 22, 23, 24 previously assigned to several selected logical operands.

[0033] FIG. 5 is a flowchart illustrating an embodiment of a method 54 of re-executing instructions in the processor 38 of FIG. 3 by backing out of an architectural register state. Blocks 48, 49, 50, 51, and 52 were described in FIG. 4. At block 55, the retirement unit 26 writes the identity of the second register 22 to one of the positions 40, 41 of the back-out register file 39. The positions 40, 41 correspond to a destination logical operand X to which the second register was assigned by the allocator 21. At block 56, the retirement unit 26 sets a third instruction for re-execution, e.g., in response to an exception. The third instruction has the X logical operand as a source address. At block 57, the microcode 45 reads the back-out register file 39 to determine the identifier of the previously assigned register for the logical operand X, i.e., the second register 22, and reassigns the identifier therefrom to the logical operand X in the third instruction. At block 58, the microcode 45 redirects the decoder 44 to send the third instruction for re-execution with the logical operand X replaced by the second register 22. Thus, the processor 38 backs out of an architectural register state to re-execute the third instruction.
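The operand substitution of blocks 55 through 58 may be sketched, for illustration only, as a lookup in a table keyed by logical operand. The names Instruction, BackOutRegisterFile, and prepare_reexecution, the opcode "fadd", and the "R22" naming of a register identifier are hypothetical; the microcode 45 of the embodiments is hardware and read-only memory, not Python.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    opcode: str
    dest: str
    sources: tuple


class BackOutRegisterFile:
    """Hypothetical stand-in for the storage positions 40, 41."""

    def __init__(self):
        self.positions = {}  # logical operand -> identifier of a delayed register

    def save(self, logical_operand, register_id):
        # Block 55: the retirement unit records the previously assigned register.
        self.positions[logical_operand] = register_id


def prepare_reexecution(instr, backout):
    # Blocks 57 and 58: each source operand that has a saved identifier is
    # replaced by the corresponding delayed register; others are left alone.
    new_sources = tuple(
        f"R{backout.positions[s]}" if s in backout.positions else s
        for s in instr.sources
    )
    return Instruction(instr.opcode, instr.dest, new_sources)


backout = BackOutRegisterFile()
backout.save("X", 22)                          # second register 22 is delayed
third = Instruction("fadd", "Y", ("X", "Z"))   # third instruction reads X
reexec = prepare_reexecution(third, backout)
```

Here reexec reads the delayed register in place of X, while the operand Z, which has no saved identifier, is passed through unchanged.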

[0034] In one embodiment, the first and second instructions of FIGS. 4 and 5 result from decoding an incoming “packed” floating-point macro-instruction, i.e., an instruction performing several floating-point operations in parallel, or multimedia macro-instruction. In this embodiment, the exceptions stimulating a back-out of an architectural state occur on the second or later sequential instruction of the same class. The processor 38 of FIG. 3 recovers from exceptions on either the first or the second instructions by correcting and re-executing both. In some embodiments, correcting and re-executing both instructions may reduce the time and hardware used to detect exceptions. The architectural register state is no longer proper for re-executing the first or earlier sequential instruction, but the back-out register file 39 enables the processor 38 to restore the proper register state.

[0035] Some embodiments may increase efficiency by re-executing all instructions coming from decoding a selected macro-instruction even when only a subset of the instructions encounter exceptions. This method may reduce the amount of hardware employed for detecting exceptions. Similarly, less operating time may be used to determine whether any, as opposed to which, of the selected instructions encountered an exception. In some embodiments, the time costs to individually detect the selected exceptions are high, and the selected exceptions are rare. Then, the added time to re-execute all the instructions coming from decoding a single macro-instruction may be less than the total time saved. Then, re-execution by the back-out methods of FIGS. 4 and 5 may increase the effective performance of a processor.
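The trade-off described above may be made concrete with a simple expected-cost comparison. The model and every number in it are assumptions chosen for illustration and are not taken from the embodiments: n is the number of instructions decoded from one macro-instruction, detect_each and detect_any are hypothetical per-sequence detection overheads in cycles, p_exception is the probability that the sequence excepts, and reexec_cost is the cost of re-executing one instruction.

```python
def expected_overhead_precise(n, detect_each, p_exception, reexec_cost):
    # Every instruction pays for individual detection; on an exception only
    # the excepting instruction is re-executed.
    return n * detect_each + p_exception * reexec_cost


def expected_overhead_coarse(n, detect_any, p_exception, reexec_cost):
    # One cheaper "did anything except?" check; on an exception the whole
    # sequence of n instructions is re-executed.
    return detect_any + p_exception * n * reexec_cost


# Hypothetical figures: 4 instructions, rare exceptions (1 in 1000).
precise = expected_overhead_precise(4, 2.0, 0.001, 20.0)
coarse = expected_overhead_coarse(4, 1.0, 0.001, 20.0)
```

Under these assumed figures the coarse scheme has the lower expected overhead because the extra re-execution work is weighted by a small probability; with frequent exceptions or cheap individual detection the comparison may reverse.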

[0036] FIG. 6 is a high-level block diagram illustrating an embodiment of an out-of-order processor 60 that employs speculative execution and also backs out of some architectural states for re-executions involving selected registers 22, 23, 24. A line 61 brings incoming instructions to a decoder 64. The decoder 64 includes a multiplexer (MUX) 63 having first and second input ports 62, 100. The first and second input ports 62, 100 receive newly decoded instructions and instructions for re-execution, respectively. The decoder 64 sends instructions from an output port 65 of the MUX 63 to the allocator 21. The allocator 21 may write and read identifiers of a portion of the registers 22, 23, 24 to and from a register allocation table (“RAT”) 66. The rows 67, 68, 69 of the RAT 66 have both speculative and architectural assignment positions 70, 71 to store the identifiers of the portion of the registers 22, 23, 24 assigned to the destination logical operands of the instructions. The execution units 27, 72 in the particular embodiment of FIG. 6 may also execute the instructions out-of-order. A reorder queue (“ROQ”) 73 saves the original instruction sequence so that retirement of executed instructions may be performed in-order. The retirement unit 26 may write the identifiers of selected classes of the registers 22, 23, 24 to a back-out register 74 having one or more storage positions (not shown).

[0037] Still referring to FIG. 6, the allocator 21 assignments are initially speculative. The retirement unit 26 flushes unretired instructions from portions of the processor 60 between the allocator 21 and the retirement unit 26 in response to certain exceptions. To recover from the exceptions, the processor 60 may copy the entries of the architectural assignment positions 71 to the speculative assignment positions 70. Then, re-execution of unretired and improperly executed instructions may start from the earlier state defined by the architectural register assignments. The speculative assignments become architectural in response to the proper retirement of the instruction to which the assignments were made. Thus, re-execution in response to such exceptions, as opposed to the selected exceptions of FIGS. 1-5, does not entail backing out of the “architectural” state defined by the assignments of retired instructions.
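The recovery that does not involve a back-out, i.e., copying the architectural assignment positions 71 to the speculative assignment positions 70, may be sketched as follows. The classes RatRow and RegisterAllocationTable and their method names are hypothetical, and the register names R1, R2, R5 are arbitrary; only the copy itself is taken from the description above.

```python
class RatRow:
    """One row of a hypothetical RAT: speculative and architectural positions."""

    def __init__(self, register_id):
        self.speculative = register_id
        self.architectural = register_id


class RegisterAllocationTable:
    def __init__(self, initial):
        self.rows = {operand: RatRow(reg) for operand, reg in initial.items()}

    def allocate(self, operand, register_id):
        # Allocator assignments are initially speculative.
        self.rows[operand].speculative = register_id

    def retire(self, operand, register_id):
        # A speculative assignment becomes architectural on proper retirement.
        self.rows[operand].architectural = register_id

    def recover(self):
        # Flush: every speculative position is overwritten by the
        # architectural position, discarding unretired assignments.
        for row in self.rows.values():
            row.speculative = row.architectural


rat = RegisterAllocationTable({"X": "R2", "Y": "R5"})
rat.allocate("X", "R1")   # an unretired instruction renames X
rat.recover()             # an exception flushes the unretired instruction
```

After recover, X again maps to R2 in both positions; no retired write has been undone, which is what distinguishes this path from a back-out.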

[0038] FIG. 7 is a time line 80 of the RAT 66 and the back-out register 74 as instructions I₀ and I₁ progress through the embodiment of the processor 60 illustrated in FIG. 6. At block 82, the instruction I₀ retires. At block 84, the row 68 of the RAT 66 for the logical operand X stores the identifier of the register R₂ in both the speculative and the architectural assignment positions 70, 71, because the allocator 21 had assigned the register R₂ to I₀. The speculative and architectural assignment positions 70, 71 may store identical identifiers between the retirement of an instruction and the allocation of a new register 22, 23, 24 to a second instruction having the same destination logical operand as the first instruction.

[0039] At block 84 of FIG. 7, the entries R₂, R₃, and R₄ of the RAT 66 are “used” registers, meaning they may be read by active and/or incoming instructions. Active and incoming instructions may read data from the registers R₂ and R₃ in the speculative assignment positions 70. Active and incoming instructions may also read the registers R₂ and R₄ in the architectural assignment positions 71 if the retirement unit 26 copies architectural register assignments to corresponding speculative assignments in response to an exception. As discussed with respect to FIG. 6, this corresponds to a re-execution without a back-out from an architectural state, which is instituted for certain exceptions in the embodiment of FIG. 6. The register identifiers R₂, R₃, and R₄ in either the speculative or the architectural assignment positions 70, 71 correspond to physical addresses of “used” registers 22, 23, 24, because unretired instructions may read the data stored therein in this embodiment.

[0040] Still referring to FIG. 7, the allocator 21 assigns the register R₁ to the logical operand X of the instruction I₁ at block 86. At block 88, the speculative assignment position 70 for the logical operand X stores the identifier R₁ in response to the assignment of block 86. At block 90, the instruction I₁ retires without exceptions. At block 92, the retirement unit 26 writes the identifier R₁ to the architectural assignment position 71 for the logical operand X and writes the identifier R₂, from the previous architectural assignment for X, to the back-out register 74. The register R₂ is a “delayed” register, as defined above, because unretired instructions may not read R₂ even in response to an exception. The register R₂ may be read if the processor 60 performs a re-execution by backing out of the writes by retired instructions, i.e., instructions that were properly executed.

[0041] Referring back to FIG. 6, the processor 60 may handle a selected class of exceptions in a manner that includes backing out of writes by selected retired instructions, i.e., instructions that have been determined to have properly executed. In one embodiment, the processor 60 backs out of writes by the retired instructions to execute a new instruction in response to the selected class of exceptions. The new instruction is executed in a “previous” architectural register state. The back-out register 74 stores register assignments for logical operands of the selected retired instructions. These register assignments enable backing out of the present architectural register state so that the execution of the new instruction can be performed with the “previous” architectural state.

[0042] In some embodiments, back-out execution enables the processor 60 to execute an entire sequence of instructions in a previous architectural register state. For example, one embodiment performs a back-out execution of a new sequence of instructions, i.e., μI′₁, μI′₂, etc., in response to exceptions occurring on any instruction of a selected sequence μI₁, μI₂, etc., wherein the sequence comes from decoding one macro-instruction. The new sequence μI′₁, μI′₂, etc., may differ from the original sequence, μI₁, μI₂, etc., to correct the problems that caused the exception. In this embodiment, the processor 60 effectively re-executes all of the sequence, e.g., μI₁, μI₂, etc., even though the architectural state has changed due to the retirement of earlier instructions of the sequence, i.e., instructions not registering exceptions. In some embodiments, such a procedure may reduce the hardware and time costs employed for detecting the selected exceptions.
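A small numerical sketch may help show why re-executing a retired member of the sequence calls for the previous architectural state. It models an instruction of the form X+Y→X, analogous to the R+R′→R example given in the background. The physical register names P1, P2, P3, the data values, and the dictionaries phys, arch, and backout are hypothetical and serve only to illustrate the effect.

```python
# Contents of hypothetical physical registers.
phys = {"P1": None, "P2": 5, "P3": 3}

arch = {"X": "P2", "Y": "P3"}   # architectural assignments before the sequence
backout = {}                    # logical operand -> delayed register


def execute_add(dest_phys, src_a_phys, src_b_phys):
    phys[dest_phys] = phys[src_a_phys] + phys[src_b_phys]


# First member of the sequence: X = X + Y, destination X renamed to P1.
execute_add("P1", arch["X"], arch["Y"])

# The first member retires: the old assignment of X becomes a delayed register.
backout["X"] = arch["X"]
arch["X"] = "P1"

# A later member excepts, so the whole sequence is to be re-executed.
# Reading X from the present architectural state would add Y a second time.
wrong = phys[arch["X"]] + phys[arch["Y"]]

# Backing out: the source X is read from the delayed register instead.
execute_add("P1", backout["X"], arch["Y"])
right = phys["P1"]
```

In this sketch, wrong evaluates to 11 and right to 8: the delayed register P2 still holds the original value 5, so the re-executed instruction reproduces the intended result.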

[0043] Referring to FIG. 6, the retirement unit 26 delays the deallocation of selected registers 22, 23, 24 of previously retired instructions by transferring the corresponding identifiers of the registers 22, 23, 24 from the architectural assignment positions 71 to the back-out register 74. The retirement unit 26 does not inform the allocator 21 that the delayed registers 22, 23, 24 are available. The retirement unit 26 writes the identifiers of the registers 22, 23, 24 of the previously retired instructions to the back-out register 74 in response to determining that a later instruction, having the same destination logical operand, is ready to retire.

[0044] Referring still to FIG. 6, the decoder 64 receives instructions for back-out re-execution from line 100. The retirement unit 26 directs the back-out re-execution by a signal to a select input port 102 of the MUX 63. The selected logical operands of the instructions for back-out re-execution are assigned register identifiers from the back-out register 74. For example, the logical operand X becomes the register R₂ in the example of block 92 in FIG. 7. In some embodiments, microcode (not shown) creates the machine code for the instructions for back-out re-execution. The machine code may also contain one or more bits that direct the allocator 21 not to assign other registers 22, 23, 24 to logical operands already assigned identifiers of “delayed” registers.

[0045] FIG. 8 is a flowchart illustrating an embodiment 110 of a method of operating the processor 60 of FIG. 6. At block 112, the allocator 21 receives a first instruction having a destination logical operand X. At block 114, the allocator 21 assigns a first register to the logical operand X and writes the corresponding first identifier thereof to the speculative assignment position 70 in the RAT 66 for X. Subsequent instructions with the source logical operand X will read the first register. At block 116, the retirement unit 26 retires the executed first instruction and writes the first identifier to the architectural assignment position 71 for X. At block 118, the allocator 21 writes a second identifier, corresponding to a second register, to the speculative assignment position 70 for X in response to a second instruction having the destination logical operand X. At block 120, the retirement unit 26 writes the first identifier from the architectural assignment position 71 for X to the back-out register 74 in response to determining that the second instruction is ready to retire. Now, the first register is a delayed register, and active and/or incoming instructions may neither read from nor write to the first register. At block 122, the retirement unit 26 writes the second identifier to the architectural assignment position 71 for X in response to retiring the second instruction. At block 124, some embodiments deallocate the first register in response to retiring another instruction with the destination logical operand X.
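The sequence of blocks 112 through 124 may be traced with the following sketch, offered under stated assumptions rather than as an implementation of the processor 60. The class Processor, its methods, and the register names R1, R2, R3 are hypothetical, a single back-out position per logical operand is assumed, and the retirement of a third instruction is used to illustrate the deallocation of block 124.

```python
class Processor:
    """Hypothetical software model of the RAT 66 and back-out register 74."""

    def __init__(self, free_registers):
        self.free = list(free_registers)  # "available" registers
        self.speculative = {}             # speculative assignment positions 70
        self.architectural = {}           # architectural assignment positions 71
        self.backout = {}                 # back-out register 74

    def allocate(self, operand):
        # Blocks 114 and 118: assign an available register speculatively.
        reg = self.free.pop(0)
        self.speculative[operand] = reg
        return reg

    def retire(self, operand, reg):
        # Block 124: a register already delayed for this operand is released.
        stale = self.backout.pop(operand, None)
        if stale is not None:
            self.free.append(stale)
        # Block 120: the previous architectural register becomes "delayed".
        previous = self.architectural.get(operand)
        if previous is not None:
            self.backout[operand] = previous
        # Blocks 116 and 122: the retiring assignment becomes architectural.
        self.architectural[operand] = reg


cpu = Processor(["R1", "R2", "R3"])
first = cpu.allocate("X")
cpu.retire("X", first)                    # X -> R1 is architectural
second = cpu.allocate("X")
cpu.retire("X", second)                   # R1 is now delayed, not available
delayed_after_second = cpu.backout["X"]
free_after_second = list(cpu.free)
third = cpu.allocate("X")
cpu.retire("X", third)                    # R1 is deallocated, R2 is delayed
```

After the second retirement the first register is held in the back-out position and is absent from the available list; only the third retirement returns it to the allocator.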

[0046] The particular embodiments disclosed above are illustrative only, as the invention may be modified and practiced in different but equivalent manners apparent to those skilled in the art having the benefit of the teachings herein. Furthermore, no limitations are intended to the details of construction or design herein shown, other than as described in the claims below. It is therefore evident that the particular embodiments disclosed above may be altered or modified and all such variations are considered within the scope and spirit of the invention. Accordingly, the protection sought herein is as set forth in the claims below.

What is claimed is:
 1. A apparatus, comprising: a processor having aplurality of registers, the processor being capable of executing atleast one selected instruction by backing out of an architecturalregister state.
 2. The apparatus as set forth in claim 1, the processorfurther comprising: at least one back-out register adapted to store anidentifier of a delayed register, the processor capable of reassigningthe identifier to a logical operand of the selected instruction.
 3. The apparatus as set forth in claim 1, wherein the processor is capable of executing a plurality of instructions speculatively.
 4. The apparatus as set forth in claim 1, the processor further comprising a retirement unit to delay deallocation of a first one of the registers in response to the retirement of a second one of the registers.
 5. A method for backing a processor out of an architectural state, the method comprising: reassigning a register to a logical operand of an instruction, the register having been assigned to the logical operand in a previous architectural state; and executing the instruction.
 6. The method as set forth in claim 5, wherein the act of reassigning assigns a register that is neither used nor available.
 7. The method as set forth in claim 5, further comprising delaying deallocation of the register in response to retiring a second instruction.
 8. The method as set forth in claim 7, wherein the act of delaying comprises inserting an identifier of the register in a back-out register and wherein the act of reassigning includes writing the identifier from the back-out register into a machine code for the instruction.
 9. A processor, comprising: a plurality of registers; an allocator to assign the registers to logical operands of instructions; at least one execution unit; and a retirement unit to retire instructions executed by the execution unit and to deallocate the registers assigned to logical operands of instructions, the retirement unit being capable of changing at least one of the registers assigned to a first instruction to a delayed register.
 10. The processor as set forth in claim 9, wherein the retirement unit is adapted to delay the deallocation of the one of the registers in the absence of an instruction capable of reading the one of the registers.
 11. The processor as set forth in claim 9, wherein the plurality of registers belong to a preselected class.
 12. The processor as set forth in claim 9, wherein the plurality of registers include at least one of a floating-point register and a multimedia register.
 13. The processor as set forth in claim 9, wherein the retirement unit is adapted to delay the deallocation of a first register assigned to a destination logical operand of a first retired instruction in response to determining that a second instruction is ready to retire, the second instruction having a second register assigned to the destination logical operand.
 14. The processor as set forth in claim 9, further comprising a back-out register having at least one storage position, the retirement unit to write an identifier of the one of the registers to the storage position to change the one of the registers to a delayed register.
 15. The processor as set forth in claim 14, wherein the storage position corresponds to a particular logical operand.
 16. The processor as set forth in claim 9, wherein the one of the registers is assigned to a first instruction and wherein the retirement unit is capable of changing the one of the registers to a delayed register in response to retiring a second instruction, the second instruction having a second register, the first and second registers being architectural and speculative registers, respectively, assigned to the same destination logical operand.
 17. A processor capable of backing out of an architectural state, comprising: a plurality of registers; a decoder; an allocator to assign the registers to destination logical operands of a portion of instructions received from the decoder; at least one execution unit to execute a portion of instructions; and a retirement unit to set selected instructions for back-out re-execution and to delay deallocation of a first register assigned to a first retired instruction in response to retiring a second instruction, a second register assigned to the second instruction.
 18. The processor as set forth in claim 17, wherein the first and second registers correspond to the same destination logical operand in the first and second instructions, respectively.
 19. The processor as set forth in claim 17, wherein the retirement unit is adapted to assign the first register to a third instruction in response to sending the third instruction for back-out execution, the first register being assigned to the same logical operand in the third instruction and the first instruction.
 20. The processor as set forth in claim 17, wherein the retirement unit is adapted to set instructions for back-out execution in response to selected exceptions.
 21. The processor as set forth in claim 17, further comprising: a back-out register, the retirement unit adapted to write an identifier of the first register to the back-out register in response to retiring the second instruction; and microcode to receive the identifier from the back-out register and to insert the identifier into one of the selected instructions in response to the one of the selected instructions being sent for back-out execution, the one of the selected instructions having a logical operand, the first register being assigned to the logical operand in the first instruction.
 22. A method, comprising: allocating a first register to a logical operand in a first instruction; allocating a second register to the logical operand in a second instruction; executing the first and second instructions; retiring the first instruction; and saving the identifier of the first register in response to retiring the second instruction.
 23. The method as set forth in claim 22, wherein the act of saving includes delaying the deallocation of the first register.
 24. The method as set forth in claim 22, further comprising executing a third instruction, the third instruction having the logical operand as an address, the act of executing including reassigning the first register to the logical operand in the third instruction.
 25. The method as set forth in claim 24, wherein the act of executing a third instruction is in response to detecting a preselected exception.
 26. The method as set forth in claim 24, wherein the act of executing a third instruction is performed in response to an exception on a fourth instruction, the fourth instruction being a non-leading instruction in a sequence of instructions generated by decoding a selected macro-instruction.
 27. The method as set forth in claim 22, wherein the act of saving includes transferring the identifier for the first register from the architectural register assignment position in a register allocation table to a back-out register.
 28. A method, comprising: executing a first instruction having a first register assigned to a destination logical operand; retiring a second instruction assigned a second register to the destination logical operand; delaying the deallocation of the second register in response to determining that the first instruction is ready to retire; retiring the first instruction; and executing a third instruction having the logical operand as a source or destination address, the act of executing including assigning the second register to the logical operand.
 29. The method as set forth in claim 28, wherein the act of executing the third instruction is in response to a preselected exception.
 30. The method as set forth in claim 28, wherein the act of delaying includes writing an identifier of the second register to a position in a back-out register, the position corresponding to the logical operand.
 31. The method as set forth in claim 28, wherein the act of executing includes reading the identifier from the back-out register and writing the identifier to the position for the logical operand in a machine code for the third instruction.
 32. The method as set forth in claim 28, wherein the act of executing includes redirecting the instruction flow to receive the third instruction.
 33. The method as set forth in claim 28, wherein the act of executing the third instruction includes re-executing a properly retired instruction.